FUNCTIONAL SINGLE INDEX MODELS FOR LONGITUDINAL DATA

By Ci-Ren Jiang and Jane-Ling Wang

University of California, Berkeley and University of California, Davis 

A new single-index model that reflects the time-dynamic effects of the single index is proposed for longitudinal and functional response data, possibly measured with errors, for both longitudinal and time-invariant covariates. With appropriate initial estimates of the parametric index, the proposed estimator is shown to be $\sqrt{n}$-consistent and asymptotically normally distributed. We also address the nonparametric estimation of regression functions and provide estimates with optimal convergence rates. One advantage of the new approach is that the same bandwidth is used to estimate both the nonparametric mean function and the parameter in the index. The finite-sample performance of the proposed procedure is studied numerically.

1. Introduction. For univariate response variables $Y$ with multivariate covariate $Z \in \mathbb{R}^p$, the single-index model

(1.1) $E(Y \mid Z) = m(\beta_0^T Z)$

is an attractive dimension-reduction method to model the effect of multivariate covariates nonparametrically. Since $m(\cdot)$, known as the link function, is an unknown smooth function, the scale of $\beta_0^T Z$ may be determined arbitrarily. For identifiability reasons, $\beta_0$ is often assumed to be a unit vector with nonnegative first coordinate. The primary parameter of interest is the coefficient $\beta_0$ in the index $\beta_0^T Z$ since $\beta_0$ makes explicit the relationship between the response variable $Y$ and the covariate $Z$. There are several different approaches to estimate $\beta_0$ in (1.1), such as the projection pursuit regression [Friedman and Stuetzle (1981), Hall (1989)], average derivatives [Härdle and Stoker (1989), Ichimura (1993)] and partial least-squares [Naik and Tsai (2000)] methods. Typically, the link function needs to be undersmoothed in order to estimate $\beta_0$ at the $\sqrt{n}$-rate. Härdle, Hall and Ichimura (1993) showed that a $\sqrt{n}$-consistent estimator of $\beta_0$ can be achieved without undersmoothing the link function, that is, the same bandwidth can be used to estimate both the parameter $\beta_0$ and the nonparametric link function $m(\cdot)$. However, their approach relies on a grid search to obtain the estimate for $\beta_0$ and is time consuming when the dimension $p$ is high. To overcome this drawback, and inspired by the sliced inverse regression method [Li (1991)], Xia et al. (2002) proposed a new method, called "conditional minimum average variance estimation" (MAVE). Unlike most previous methods, MAVE does not need to undersmooth the nonparametric link function estimator to attain the $\sqrt{n}$-rate consistency for the parametric index estimate. Also, it does not require strong assumptions on the distribution of the covariates. Theoretical results for this approach to single-index models are available in Xia (2006) and some extensions have been studied in Xia (2007) and Kong and Xia (2007), among others. However, none of these works addresses longitudinal data, which is the focus of this paper.

Our goal is to extend MAVE to the following single-index models for functional/longitudinal response data:

(1.2) $E\{Y(t) \mid Z(t)\} = \mu(t, \beta_0^T Z(t))$,

where $Y(t)$, $t \in \mathcal{T}$, is a stochastic process on a compact time interval $\mathcal{T}$, $Z$ contains $p$ covariates, some or all of which may be stochastic functions over the time interval $\mathcal{T}$, and, to be identifiable, $\beta_0$ is a unit vector with nonnegative first coordinate. More specifically,

(1.3) $Y(t) = \mu(t, \beta_0^T Z(t)) + \varepsilon(t, Z(t))$,

where $\mu$ is an unknown bivariate link function and $\varepsilon(t, Z(t))$ is a random function with mean zero that reflects the within-subject correlations of measurements and possibly measurement errors at different time points. Thus, there are two distinctive features in the functional single-index model (1.2), as compared to the traditional single-index model (1.1) considered in Xia (2006) and Xia et al. (2002). First, the functional single-index model accommodates longitudinal response and longitudinal covariates, as well as vector covariates. Second, the effects of the single index and, consequently, of the covariates $Z$, may change dynamically over time through a bivariate link function, which seems more realistic for longitudinal responses.

Recently, Bai, Fung and Zhu (2009) combined penalized splines and quadratic inference functions to estimate the index coefficient and unknown link function in a single-index model for longitudinal data. However, the link function in their model is univariate and thus does not reflect the dynamic effects of the single index. Moreover, their approach is restricted to generalized linear models, where the variance function of the response is a known function of the mean function. In contrast, the link function in our model is an unknown function of time and the index, reflecting the dynamic feature of the effect of the single index, and the structure of the variance function is not restricted in our approach.

The rest of this paper is organized as follows. Section 2 extends the original 
MAVE method to longitudinal data. Asymptotic theory for the proposed 
estimators is described in Section 3, with proofs in the Appendix. Practical 
implementations of the new approaches and simulation studies are presented 
in Section 4. In Section 5, we apply our method to two AIDS data sets: one with time-invariant covariates only and the other also involving longitudinal covariates. Section 6 contains our conclusions.

2. Methodology. We begin with the setting of model (1.1) for univariate response $Y$ and multivariate covariate $Z \in \mathbb{R}^p$. Let $\sigma_\beta^2(\beta^T Z)$ be the conditional variance of $Y$ given $\beta^T Z$. The true direction $\beta_0$ in (1.1) is the solution of $\beta$ that minimizes $E\{\sigma_\beta^2(\beta^T Z)\} = E\{Y - E(Y \mid \beta^T Z)\}^2$.

For a random sample, $\{(Y_i, Z_i), i = 1, \ldots, n\}$, of $(Y, Z)$, $E(Y_i \mid \beta^T Z_i)$ can be approximated locally at $\beta^T Z_j$ by a linear expansion, that is, $E(Y_i \mid \beta^T Z_i) \approx a_j + b_j \beta^T (Z_i - Z_j)$. Empirically, $\sigma_\beta^2(\beta^T Z)$ can be approximated at $\beta^T Z_j$ by $\sum_{i=1}^n [Y_i - \{a_j + b_j \beta^T (Z_i - Z_j)\}]^2 w_{ij}$, where $w_{ij} \ge 0$ are weights with $\sum_{i=1}^n w_{ij} = 1$, for example, $w_{ij} = K_h\{\beta^T (Z_i - Z_j)\} / \sum_{i=1}^n K_h\{\beta^T (Z_i - Z_j)\}$, where $K_h(\cdot) = h^{-d} K(\cdot/h)$ and $d$ is the dimension of $K(\cdot)$. Therefore, we can estimate $\beta_0$ by solving the minimization problem

(2.1) $\min_{\beta : |\beta| = 1,\, a,\, b} \Bigl( \sum_{j=1}^n \sum_{i=1}^n [Y_i - \{a_j + b_j \beta^T (Z_i - Z_j)\}]^2 w_{ij} \Bigr)$,

where $a = (a_1, \ldots, a_n)$ and $b = (b_1, \ldots, b_n)$. Given $\beta$, (2.1) is a local linear smoother of the data $\{Y_i, \beta^T (Z_i - Z_j)\}$, while, given $a$ and $b$, (2.1) is just a weighted least-squares problem for $\beta$. Consequently, the minimization in (2.1) can be viewed as a combination of nonparametric function estimation and parametric direction estimation. Furthermore, the weights can be updated iteratively via the relation $\hat w_{ij} = K_h\{\hat\beta^T (Z_i - Z_j)\} / \sum_{i=1}^n K_h\{\hat\beta^T (Z_i - Z_j)\}$, using the current estimate $\hat\beta$, then updating the estimate of $\beta_0$ by minimizing (2.1) with $w_{ij}$ replaced by $\hat w_{ij}$. This can be repeated until $\hat\beta$ converges and is called refined MAVE (rMAVE) in Xia et al. (2002).
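The alternating scheme just described can be sketched in a few lines of plain Python for the cross-sectional model (1.1). This is an illustration only, not the authors' implementation: the Gaussian kernel, the fixed bandwidth `h`, the small ridge constants, the toy link $2u + \sin(u)$ and the helper names `solve` and `rmave` are all assumptions made for this example.

```python
import math
import random


def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[r][:] + [b[r]] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def rmave(Z, Y, beta, h, n_iter=30, tol=1e-4):
    """Illustrative refined MAVE for E(Y|Z) = m(beta^T Z); Gaussian kernel assumed."""
    n, p = len(Y), len(beta)
    for _ in range(n_iter):
        proj = [sum(b * z for b, z in zip(beta, Zi)) for Zi in Z]
        A = [[1e-8 * (r == c) for c in range(p)] for r in range(p)]  # tiny ridge
        rhs = [0.0] * p
        for j in range(n):
            u = [proj[i] - proj[j] for i in range(n)]
            k = [math.exp(-0.5 * (ui / h) ** 2) for ui in u]
            tot = sum(k)
            w = [ki / tot for ki in k]            # refined weights, sum to 1
            s1 = sum(wi * ui for wi, ui in zip(w, u))
            s2 = sum(wi * ui * ui for wi, ui in zip(w, u))
            t0 = sum(wi * yi for wi, yi in zip(w, Y))
            t1 = sum(wi * ui * yi for wi, ui, yi in zip(w, u, Y))
            det = s2 - s1 * s1 + 1e-12
            a_j = (s2 * t0 - s1 * t1) / det       # local intercept
            b_j = (t1 - s1 * t0) / det            # local slope
            for i in range(n):                    # normal equations for beta
                dz = [Z[i][c] - Z[j][c] for c in range(p)]
                g = w[i] * b_j
                for r in range(p):
                    rhs[r] += g * dz[r] * (Y[i] - a_j)
                    for c in range(p):
                        A[r][c] += g * b_j * dz[r] * dz[c]
        new = solve(A, rhs)
        norm = math.sqrt(sum(x * x for x in new))
        new = [x / norm for x in new]
        if new[0] < 0:                            # first coordinate nonnegative
            new = [-x for x in new]
        change = math.sqrt(sum((x - y) ** 2 for x, y in zip(new, beta)))
        beta = new
        if change < tol:
            break
    return beta


# Toy data (hypothetical): true direction (2, -1, 2)/3, link m(u) = 2u + sin(u).
rng = random.Random(0)
beta0 = [2 / 3, -1 / 3, 2 / 3]
Z = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(80)]
index = [sum(b * z for b, z in zip(beta0, Zi)) for Zi in Z]
Y = [2 * u + math.sin(u) + rng.gauss(0, 0.05) for u in index]
beta_hat = rmave(Z, Y, [1 / math.sqrt(3)] * 3, h=0.7)
```

Given $(a_j, b_j)$, the update for $\beta$ is a single linear solve, which is what makes the iteration cheap compared with a grid search over directions.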

2.1. Estimation. Hereafter, the response will be longitudinal data, which typically contain random fluctuations or measurement errors. Let $Y_{ij} = Y_i(T_{ij})$ be the $j$th observation for the $i$th subject, made at a random time $T_{ij} \in \mathcal{T}$, where $\mathcal{T}$ is an interval. Along with the responses, we have information on $p$ covariates, some of which may be longitudinal covariates. Since a univariate covariate can be considered a special case of a longitudinal covariate with constant value, we will adopt the notation for longitudinal covariates and define $Z_{ij} = Z_i(T_{ij}) \in \mathbb{R}^p$, $i = 1, \ldots, n$ and $j = 1, \ldots, N_i$, as the $p$-dimensional covariate for the $i$th subject evaluated at time $T_{ij}$. The functional single-index model (1.3) applied to the observed longitudinal and covariate data leads to

(2.2) $Y_{ij} = \mu(T_{ij}, \beta_0^T Z_{ij}) + \varepsilon(T_{ij}, Z_{ij})$.

For simplicity, we only consider bounded covariates $Z$ when deriving theoretical properties, even though our simulation study shows that the method could work well for unbounded covariates. The boundedness assumption is commonly adopted in the literature, for example, in Härdle, Hall and Ichimura (1993) and Härdle and Stoker (1989). Here, we assume that the measurement times $T_{ij}$ are a random sample of size $N_i$, assumed to be i.i.d. and independent of all other random variables.

The two main steps in our approach are to estimate the direction $\beta_0$ and the mean function $\mu$. In particular, we show how to estimate the parametric index $\beta_0$ by adapting rMAVE for longitudinal data. The asymptotic distribution of $\hat\beta$ is studied in Section 3 for both longitudinal and time-invariant covariates. The mean function can then be estimated through a two-dimensional scatter plot smoother of $Y_{ij}$ on $(T_{ij}, \hat\beta^T Z_{ij})$ when $\hat\beta$ is available.

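To estimate the parametric index efficiently, we extend rMAVE to longitudinal data. For simplicity, and to avoid the curse of dimensionality, we only consider a single index in our model. Therefore, $\beta$ is a vector instead of a matrix. As in MAVE, for any given $(T_{j\ell}, Z_{j\ell})$, $E(Y_{ik} \mid T_{ik}, \beta^T Z_{ik})$ can be approximated by a linear expansion at $(T_{j\ell}, \beta^T Z_{j\ell})$, that is, $E(Y_{ik} \mid T_{ik}, \beta^T Z_{ik}) \approx a_{j\ell} + b_{j\ell}(T_{ik} - T_{j\ell}) + d_{j\ell}\beta^T (Z_{ik} - Z_{j\ell})$. Similarly, the conditional variance, $\sigma_\beta^2(T_{ik}, \beta^T Z_{ik}) = E\{Y_{ik} - E(Y_{ik} \mid T_{ik}, \beta^T Z_{ik})\}^2$, can be approximated by $\sum_{i=1}^n \sum_{k=1}^{N_i} [Y_{ik} - \{a_{j\ell} + b_{j\ell}(T_{ik} - T_{j\ell}) + d_{j\ell}\beta^T (Z_{ik} - Z_{j\ell})\}]^2 w_{ikj\ell}$, where

$w_{ikj\ell} = \dfrac{K\{(T_{ik} - T_{j\ell})/h_t,\ \beta^T (Z_{ik} - Z_{j\ell})/h_z\}}{\sum_{i=1}^n \sum_{k=1}^{N_i} K\{(T_{ik} - T_{j\ell})/h_t,\ \beta^T (Z_{ik} - Z_{j\ell})/h_z\}}, \qquad \sum_{i=1}^n \sum_{k=1}^{N_i} w_{ikj\ell} = 1.$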

Here, $K(\cdot)$ is a two-dimensional kernel function of order (0,2) defined in Appendix C with compact support that is also a symmetric density function with finite moments of all orders and bounded derivatives; $h_t$ and $h_z$ are the respective bandwidths for smoothing along the time ($t$) and single-index covariate ($\beta^T z$) directions. We can then estimate $\beta_0$ by solving the minimization problem

(2.3) $\min_{\beta : |\beta| = 1,\, a_{j\ell},\, b_{j\ell},\, d_{j\ell}} \Bigl( \sum_{j=1}^n \sum_{\ell=1}^{N_j} \sum_{i=1}^n \sum_{k=1}^{N_i} [Y_{ik} - \{a_{j\ell} + b_{j\ell}(T_{ik} - T_{j\ell}) + d_{j\ell}\beta^T (Z_{ik} - Z_{j\ell})\}]^2 w_{ikj\ell} \Bigr)$.

Suppose that we have a current estimator $\hat\beta$ of $\beta_0$ and current refined weights $\hat w_{ikj\ell}$. The estimate for $\beta_0$ will be updated by minimizing equation (2.3) with $w_{ikj\ell}$ replaced by $\hat w_{ikj\ell}$. This procedure will be repeated until $\hat\beta$ converges. The final estimate, $\hat\beta$, can then be used to estimate the mean function $\mu$ via a two-dimensional smoother that has the same bandwidth as the weights in (2.3), that is, $\hat\mu(t, \hat\beta^T z) = \hat b_0$ where, for $b = (b_0, b_1, b_2)$,

(2.4) $\hat b = \arg\min_b \sum_{i=1}^n \sum_{j=1}^{N_i} K\Bigl\{\dfrac{T_{ij} - t}{h_t}, \dfrac{\hat\beta^T (Z_{ij} - z)}{h_z}\Bigr\} \times \{Y_{ij} - b_0 - b_1 (T_{ij} - t) - b_2 \hat\beta^T (Z_{ij} - z)\}^2.$
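A minimal sketch of the two-dimensional local linear smoother in (2.4) follows. It assumes the index values $u_{ij} = \hat\beta^T Z_{ij}$ have already been computed, and it uses a product Epanechnikov kernel as a stand-in for the compactly supported kernel $K$; the kernel choice and the function names `epan2`, `solve` and `local_linear_2d` are assumptions of this example, not part of the paper.

```python
def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [A[r][:] + [b[r]] for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def epan2(u, v):
    """Product Epanechnikov kernel on [-1, 1]^2 (an assumed choice of K)."""
    if abs(u) >= 1 or abs(v) >= 1:
        return 0.0
    return 0.5625 * (1 - u * u) * (1 - v * v)


def local_linear_2d(T, U, Y, t0, u0, ht, hz):
    """Return b0, the fitted mean at (t0, u0), from the weighted fit in (2.4).

    T, U, Y are flat lists over all observations; U holds the index values.
    The window around (t0, u0) must contain enough points for a 3-parameter fit.
    """
    A = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for t, u, y in zip(T, U, Y):
        w = epan2((t - t0) / ht, (u - u0) / hz)
        if w == 0.0:
            continue
        x = (1.0, t - t0, u - u0)
        for r in range(3):
            rhs[r] += w * x[r] * y
            for c in range(3):
                A[r][c] += w * x[r] * x[c]
    return solve(A, rhs)[0]


# Sanity check on an exactly linear surface: the local linear fit reproduces it.
T = [i / 10 for i in range(11) for _ in range(11)]
U = [j / 10 for _ in range(11) for j in range(11)]
Y = [1 + 2 * t - 3 * u for t, u in zip(T, U)]
fit_center = local_linear_2d(T, U, Y, 0.5, 0.5, 0.3, 0.3)
fit_corner = local_linear_2d(T, U, Y, 0.0, 1.0, 0.3, 0.3)
```

Because the same kernel and bandwidths enter both the weights in (2.3) and the smoother in (2.4), one bandwidth pair $(h_t, h_z)$ serves both steps.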

2.2. Algorithm. Let $h_t$ and $h_z$ be the bandwidths for $T$ and $\beta^T Z$, respectively, and let $\hat\sigma_\beta^2$ denote the quantity to be minimized in (2.3), which is within the parentheses. Define $K_h(t, z) = K(t/h_t, z/h_z)/(h_t h_z)$.

1. Start with an initial value of $\beta$, say $\hat\beta_{(0)}$.

2. Use the current estimate $\hat\beta_{(m)}$ and the weighted least-squares method to obtain $(\hat a, \hat b, \hat d) = \arg\min_{a, b, d} \hat\sigma^2_{\hat\beta_{(m)}}$, where

$\hat w_{ikj\ell} = K_h\{(t_{ik} - t_{j\ell}),\ \hat\beta_{(m)}^T (z_{ik} - z_{j\ell})\} \Big/ \sum_{i=1}^n \sum_{k=1}^{N_i} K_h\{(t_{ik} - t_{j\ell}),\ \hat\beta_{(m)}^T (z_{ik} - z_{j\ell})\}.$

3. Use the estimates $(\hat a, \hat b, \hat d)$ from step 2 to obtain the updated estimate $\hat\beta_{(m+1)} = \arg\min_\beta \hat\sigma_\beta^2$.

4. Repeat steps 2 and 3 until $\|\hat\beta_{(m+1)} - \hat\beta_{(m)}\| < \epsilon$, where $\epsilon$ is some given tolerance value.

5. The final estimate of $\beta$ from step 4 is then used to reach the final estimate of the mean function defined in (2.4).
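The refined weights in step 2 can be sketched as follows for a single anchor observation $(j, \ell)$. The nested-list data layout, the product Epanechnikov kernel and the function names are assumptions of this example; any compactly supported symmetric density kernel could be substituted.

```python
def kernel_h(dt, dz, ht, hz):
    """K_h(t, z) = K(t/ht, z/hz) / (ht * hz) with a product Epanechnikov K (assumed)."""
    u, v = dt / ht, dz / hz
    if abs(u) >= 1 or abs(v) >= 1:
        return 0.0
    return 0.5625 * (1 - u * u) * (1 - v * v) / (ht * hz)


def refined_weights(times, covs, beta, anchor, ht, hz):
    """Weights over all observations (i, k) for one anchor (j, l).

    times[i][k] is T_ik and covs[i][k] is the covariate vector Z_ik.
    Returns a dict {(i, k): weight} whose values sum to 1.
    """
    def index(z):
        return sum(b * x for b, x in zip(beta, z))

    j, l = anchor
    t0, z0 = times[j][l], index(covs[j][l])
    raw = {}
    for i, (Ti, Zi) in enumerate(zip(times, covs)):
        for k, (t, z) in enumerate(zip(Ti, Zi)):
            raw[(i, k)] = kernel_h(t - t0, index(z) - z0, ht, hz)
    total = sum(raw.values())  # positive: the anchor itself has weight K(0, 0) > 0
    return {key: val / total for key, val in raw.items()}


# Two hypothetical subjects with 2 and 3 visits and p = 2 covariates.
times = [[0.1, 0.5], [0.2, 0.7, 0.9]]
covs = [[[1.0, 0.0], [1.0, 0.5]], [[0.0, 1.0], [0.0, 1.2], [0.0, 0.8]]]
w = refined_weights(times, covs, [0.6, 0.8], anchor=(0, 0), ht=0.5, hz=1.0)
```

With a compactly supported kernel, observations far from the anchor in either time or index receive zero weight, so each local fit in step 2 only involves nearby observations.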
2.3. Bandwidth selection. Instead of selecting the bandwidths by the leave-one-curve-out cross-validation method suggested in Rice and Silverman (1991), we choose the bandwidths for the mean function estimator via an $m$-fold cross-validation procedure to reduce the computational cost. Below, we describe the $m$-fold cross-validation method for the bandwidth selection for $\hat\mu(t, \hat\beta^T z)$. Supposing that subjects are randomly divided into $m$ groups, $(S_1, S_2, \ldots, S_m)$, the $m$-fold cross-validation bandwidth is

(2.5) $h_\mu = \arg\min_h \sum_{\ell=1}^m \sum_{i \in S_\ell} \sum_{j=1}^{N_i} \{Y_{ij} - \hat\mu^{(-S_\ell)}(T_{ij}, \hat\beta^T Z_{ij})\}^2,$

where $\hat\mu^{(-S_\ell)}(T_{ij}, \hat\beta^T Z_{ij})$ is the estimated mean function at $(T_{ij}, \hat\beta^T Z_{ij})$, excluding subjects in $S_\ell$.
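The subject-level split behind (2.5) can be sketched as below. The callables `fit` and `predict` are hypothetical placeholders for whatever estimator is used on the training subjects (for instance, a local linear smoother); the paper does not prescribe this interface, and the candidate grid is left to the user.

```python
import random


def subject_folds(n_subjects, m, seed=0):
    """Randomly split subject indices 0..n_subjects-1 into m groups S_1..S_m."""
    ids = list(range(n_subjects))
    random.Random(seed).shuffle(ids)
    return [ids[f::m] for f in range(m)]


def cv_score(subjects, folds, fit, predict, h):
    """Sum of squared prediction errors, as in (2.5), for one bandwidth h.

    subjects[i] is a list of (T_ij, index_ij, Y_ij) triples. Whole subjects
    are held out, so within-subject correlation does not leak into training.
    """
    total = 0.0
    for fold in folds:
        held_out = set(fold)
        train = [s for i, s in enumerate(subjects) if i not in held_out]
        model = fit(train, h)
        for i in fold:
            for t, u, y in subjects[i]:
                total += (y - predict(model, t, u)) ** 2
    return total


def select_bandwidth(subjects, candidates, fit, predict, m=10, seed=0):
    """Return the candidate bandwidth with the smallest m-fold CV score."""
    folds = subject_folds(len(subjects), m, seed)
    return min(candidates, key=lambda h: cv_score(subjects, folds, fit, predict, h))


# Toy check with a deliberately trivial "estimator" that predicts the value h.
toy_subjects = [[(0.1 * k, 0.0, 2.0) for k in range(3)] for _ in range(23)]
toy_folds = subject_folds(23, 10)
best = select_bandwidth(toy_subjects, [0.5, 2.0, 3.5],
                        fit=lambda train, h: h,
                        predict=lambda model, t, u: model)
```

Compared with leave-one-curve-out, the estimator is refitted only $m$ times per candidate bandwidth rather than $n$ times.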



3. Asymptotic results. We assume that $(T_{ij}, Z_{ij}, Y_{ij})$ have the same distribution as $(T, Z, Y)$ with joint probability density function $g_3(t, z, y)$ and that the observational times $T_{ij}$ are i.i.d. with probability density function $g(t)$, but dependency is allowed among observations from the same subject. Let $\bar z = \beta^T z$, $\bar z_0 = \beta_0^T z$ and let $f_2(t, \bar z)$ and $f_3(t, \bar z, y)$ be the joint densities of $(T, \bar Z)$ and $(T, \bar Z, Y)$, respectively. The kernel function is assumed to be symmetric. For simplicity, we also assume that $\int u^2 K(u, v) = \int v^2 K(u, v) = \int u^2 v^2 K(u, v) = 1$ as, without loss of generality, any symmetric density kernel function can be applied after proper normalization. Since we are interested in the asymptotic distribution of $\hat\beta$, similar to the assumption in Härdle, Hall and Ichimura (1993), we assume that the initial value $\hat\beta_{(0)}$ is in a $\sqrt{n}$-neighborhood of $\beta_0$. This assumption is for technical convenience; in the simulations, an arbitrary initial value was used and it performed well. To be prudent, one may want to try different random initial values $\hat\beta_{(0)}$ and choose the final estimate as the one that leads to the smallest value in the minimization problem of (2.3). In the data analysis, we chose ten different initial values for $\hat\beta_{(0)}$ and they all converged to the same estimate $\hat\beta$.

From the iterative algorithm in Section 2.2, the updated $\hat\beta$ from minimizing (2.3) after one iteration will become

/3 = /3o + {£'^}-^T + Op(n-i/2) ^^g^g 

i=l k=lj=ll=l J'i\''^ik, ^ik) 

(3.1) _ _ 

X Kh{{Tji — Tik), {Zj£ — Zik)} 

X {Zji — Zik){Zji - Zik)'^ ,
dpiTji, Zji)



fik=ij=i e=i h{Tji,Zji) 

^ Kh{{Tik — Tji), {Zik — Zji)}{Zik — Zji) 

(3.2) x{Yik-afi{Tje,Zje) 

— bfsiTji, Zji){Tik — Tji) 

- dp(Tji, Zji){Zi,^ - Zji)}, 

$\hat f_2(t, \bar z)$ is the estimate of $f_2(t, \bar z)$ and the $\hat a_\beta$'s, $\hat b_\beta$'s and $\hat d_\beta$'s are the coefficients of the linear approximation, as defined in Section 2. By means of some tedious calculations [sketches of proofs of (3.3) and (3.4) are in Appendix B with assumptions A.1-A.6 listed in Appendix A], we can obtain the following approximations of $\{D_n\}^{-1}$ and $T$:

(3.3) {D^J-' ^ M _ ^{G^F^f^J + /3oFG+) + 1(5+ + 0,{h + 5,), 

T = G{P- /5o) - (nEiV)-i Y.Y.\ '"Poi^ik, Z,k) ^ \e^k 

1=1 k=i ^ ^ ^ 

(3.4) 

+ 0p(n-V2), 

where G = E{{d^i/dz^)^G{Z)}/2, G+ = Bo{B^GBo)~^B^ is the Moore- 
Penrose inverse of G with {/3q,Bq) an orthogonal matrix, r = K{{d/j,/dz^)'^}h^, 
F = E{{dfi/d~z^)^Fp{T,Z)}, Fp{t,z) = §,^{h{trz)^J{t,z)}/ f2{trz). G{z) = 
¥.{{Z^k - z){Zik - ^)^}, yp,{t,z) = E(Z|T = t,P^Z = P^z) - z and 5p = 
1/3 -/3o|. 

After plugging (3.3) and (3.4) into (3.1), we obtain 
^ = /3o + { ^ - ^{G+F^PI + /3oFG+) + \g^ + 0,{h + 5p) 

n Ni ^ p. \ 

X G'(/3-/3o) - {n¥.N)-^Y.Y.\ ^pATik,Zik) ^\e^k + Op{n-^'^) 

i=i k=i ^ ^ ^ 

+ Op{n-'/') 

= m + cn) + ^{I - Ml){P - M - EE ^^o^k, Z,,) ^ \e,k 

i=l k=l ^ ^ 

+ Op(n-V2),
where = hlFG+[{rMNr^ Zti EkllMT^k, Zik){dfi/dz^)}e,k - G{(3 -
/3o)]/(2r).

Since |/3| = 1, /3 needs to be standardized. From the above calculation, 
|/3| = l + Cn + 0p(n-V2) so 

^ = /?o + ^(/-/3o/3o'^)(/3-/3o) 
iPl ^ 

Therefore, in the (m + l)th iteration, 

/3(m+i) = /5o + 2 ~ /?o/5(f )(^(m) - /5o) 

i=l k=l ^ ^ 
= /3o + ^U-/3o/5(f)(/3(i)-/5o) 

\i=i / j=i fc=i ^ 

+ Op(n-^/2). 

Consequently, as the iteration $m \to \infty$, Lemma D.1 in Appendix D implies the following theorem.

Theorem 3.1. Under assumptions A.1-A.6, 

V^(/3-/3o)^A^p(0,S), 
where S = G+^*G+ and S* = mhlE[{§^ i,f,^{T, Z)e}{§^ u^.iT' , Z') x 

In practice, the covariance of $\hat\beta$ in Theorem 3.1 is unknown and needs to be estimated to make inference on $\beta$. The idea is to replace the unknown values with consistent estimates. First, $E(N)$ can be estimated by $\bar N = \sum_{i=1}^n N_i / n$ and $G$ can be estimated by

n Ni . n >. 2 

6 = ^^! — A(r,fc,/3^Z,fc)| G{Z,,k)/i2nN), 
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where G{Zik) = ^ YJj=i YJi=i{^ji - Zik){Zji - Zikf . To estimate S*, we 
estimate vi3^{T,Z) at all {Tik,Zik) by (3.5), estimate e at {Tik,Zik) by the 
residual, Yif^ — fi(Tif^, fS'^ Zif^) , and average the product terms in S*. Therefore, 



N- 



N \ N* ^ ^ *M N\nN 

(. 1=1 l<j^k<Ni J I 1=1 k=l 



where Hik = § fi{TikJ^ Z,k)u^^^{Tik, Z,k)eik and N* = ELi^/ " ^i- To 
estimate vp^^{Tik, Zik) =E{Z\T = Tik, (Sq Z = Zik) - Zik, we can, for sim- 
plicity, apply a weighted average estimator on the observations in the neigh- 
borhood of {Tik, (3q Zik), which leads to 

(3 5) i>/3 (T-fc Z k) = ^ -^fe { (^j^ -Tik), {Zje - Zjk)} ^ ^ _ 

Y^j,e Kh{{Tji - Tik),f3'^{Zji - Zik)} 

More sophisticated procedures might be considered to estimate the above 
unknown values. 

Before showing the asymptotic property of the local linear smoother, $\hat\mu(t, \hat\beta^T z(t))$, we first need the asymptotic property of the local linear smoother, $\hat\mu(t, u(t))$, where $u$ is a univariate longitudinal covariate. This is provided in Theorem 3.2 below, with the proof in Appendix C.

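Theorem 3.2. Under assumptions A.1-A.6, $h_z/h_t \to \rho$ and $nE(N)h_t^6 \to \tau^2$ for some $0 < \rho, \tau < \infty$ and

$\sqrt{n \bar N h_t h_z}\,[\hat\mu(t, u) - \mu(t, u)] \xrightarrow{D} N(\eta(t, u), \Sigma_\mu(t, u)),$

where $\eta(t, u) = \dfrac{\tau \rho^{1/2}}{2}\Bigl\{\dfrac{\partial^2 \mu}{\partial t^2} + \dfrac{\partial^2 \mu}{\partial u^2}\rho^2\Bigr\}$, $\Sigma_\mu(t, u) = [\mathrm{var}(Y \mid t, u)\,\|K_2\|^2]/f_2(t, u)$, $\|K_2\|^2 = \int K_2^2$ and $f_2(t, u)$ is the joint density of $(T, U)$.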



It is interesting that the asymptotic bias term in Theorem 3.2 depends on the ratio $h_z/h_t$. This is due to the assumption that $nE(N)h_t^6 \to \tau^2$ and assumption A.1 in Appendix A, which requires $h_t$ and $h_z$ to have the same rate. After some Taylor expansions, these two assumptions on the bandwidths lead to the asymptotic bias term in Theorem 3.2. The assumption that the two bandwidths, $h_z$ and $h_t$, have the same rate is natural since the mean function $\mu(t, z)$ has the same order of smoothness along both the $t$ and $z$ coordinates.
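The role of the bandwidth ratio can be made explicit with a short heuristic calculation. It is only a sketch under stated assumptions: unit kernel second moments, $h_z/h_t \to \rho$, and a rate condition of the form $nE(N)h_t^6 \to \tau^2$ (the usual condition for a two-dimensional local linear smoother with a second-order kernel); the rigorous argument is the one in Appendix C.

```latex
% Heuristic only; assumes \int u^2 K = \int v^2 K = 1,
% h_z / h_t \to \rho and n E(N) h_t^6 \to \tau^2.
\begin{align*}
\operatorname{bias}\{\hat\mu(t,u)\}
  &\approx \frac{1}{2}\Bigl(h_t^2\,\frac{\partial^2\mu}{\partial t^2}
     + h_z^2\,\frac{\partial^2\mu}{\partial u^2}\Bigr)
   = \frac{h_t^2}{2}\Bigl\{\frac{\partial^2\mu}{\partial t^2}
     + \Bigl(\frac{h_z}{h_t}\Bigr)^{2}\frac{\partial^2\mu}{\partial u^2}\Bigr\}, \\
\sqrt{n\bar N h_t h_z}\;\operatorname{bias}\{\hat\mu(t,u)\}
  &\approx \sqrt{n\bar N h_t^{6}}\,\Bigl(\frac{h_z}{h_t}\Bigr)^{1/2}
     \frac{1}{2}\Bigl\{\frac{\partial^2\mu}{\partial t^2}
     + \Bigl(\frac{h_z}{h_t}\Bigr)^{2}\frac{\partial^2\mu}{\partial u^2}\Bigr\}
   \;\longrightarrow\;
   \frac{\tau\rho^{1/2}}{2}\Bigl\{\frac{\partial^2\mu}{\partial t^2}
     + \rho^{2}\,\frac{\partial^2\mu}{\partial u^2}\Bigr\}.
\end{align*}
```

The second line uses $\sqrt{n\bar N h_t h_z} = \sqrt{n\bar N}\,h_t\,(h_z/h_t)^{1/2}$, so the limit depends on the two bandwidths only through $\tau$ and the ratio $\rho$.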

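Since $\hat\beta$ is a $\sqrt{n}$-consistent estimator of $\beta_0$ and by Theorem 3.2, the asymptotic properties of the local linear smoothers for the mean can be obtained.

Theorem 3.3. Under assumptions A.1-A.6, $h_z/h_t \to \rho$ and $nE(N)h_t^6 \to \tau^2$ for some $0 < \rho, \tau < \infty$, and

$\sqrt{n \bar N h_t h_z}\,\{\hat\mu(t, \hat\beta^T z) - \mu(t, \beta_0^T z)\} \xrightarrow{D} N(\Delta_\mu, \Sigma_\mu),$

where $\Delta_\mu = \dfrac{\tau \rho^{1/2}}{2}\Bigl\{\dfrac{\partial^2 \mu}{\partial t^2} + \dfrac{\partial^2 \mu}{\partial \bar z^2}\rho^2\Bigr\}$, $\Sigma_\mu = [\mathrm{var}(Y \mid t, \beta_0^T z)\,\|K_2\|^2]/f_2(t, \beta_0^T z)$ and $\|K_2\|^2 = \int K_2^2$.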

4. Simulation study. Two simulation schemes are considered in this paper. One considers the case with time-invariant covariates; the other considers longitudinal covariates. In both simulation studies, $\hat\beta_{(0)} = (1/\sqrt{p})\,1_p$, where $1_p$ is a $p$-dimensional vector with entries all equal to 1, was used as the initial value of $\beta$, the number of runs was 100 and the number of subjects for each run was $n = 100$.

4.1. Simulation I: Time-invariant covariate. The covariate for each subject $(Z_1, Z_2)$ is generated from $Z_1 \sim$ Bernoulli (with probability of success 0.5) and $Z_2 \sim N_5(0, \Psi)$, where $\Psi = 0.5 \times I_5 + 0.5 \times 1_5 1_5^T$. We choose $\beta_0 = (2, 1, 0, 3, 0, -1)^T/\sqrt{15}$. Given a subject with covariate $\bar z = (z_1, z_2^T)\beta_0$, the stochastic process Y* is generated from a Gaussian process on [0, 1] with mean function $\mu(t, \bar z) = \sin(\bar z)\sin(t\pi) + \{1 - \sin(\bar z)\}\cos(t\pi)$ and covariance function $\Gamma(s, t) = (1/4)\phi_1(t)\phi_1(s) + (1/16)\phi_2(t)\phi_2(s)$, where $\phi_1(t) = -\sin(\pi t)\sqrt{2}$ and $\phi_2(t) = \cos(\pi t)\sqrt{2}$ are the eigenfunctions of $\Gamma$, with corresponding eigenvalues 1/4 and 1/16, respectively. The measurement errors are assumed to be normally distributed with mean 0 and variance 0.01. Note that the variance of measurement error is not very small compared to the scales of the mean function and the eigenvalues.
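The data-generating mechanism of this simulation can be sketched as follows. The function names are hypothetical, the two-term Karhunen-Loève representation is used to draw the Gaussian process, and the shared-factor construction of $Z_2$ is simply one convenient way to obtain the covariance $0.5 I_5 + 0.5\,1_5 1_5^T$; none of this is taken from the authors' code.

```python
import math
import random

BETA0 = [c / math.sqrt(15) for c in (2, 1, 0, 3, 0, -1)]


def mu(t, zbar):
    """Mean function of simulation I."""
    return (math.sin(zbar) * math.sin(math.pi * t)
            + (1 - math.sin(zbar)) * math.cos(math.pi * t))


def phi1(t):
    return -math.sin(math.pi * t) * math.sqrt(2)


def phi2(t):
    return math.cos(math.pi * t) * math.sqrt(2)


def draw_covariates(rng):
    """Z1 ~ Bernoulli(0.5); Z2 ~ N_5(0, 0.5*I + 0.5*11^T) via a shared factor."""
    z1 = 1.0 if rng.random() < 0.5 else 0.0
    shared = rng.gauss(0, 1)
    z2 = [math.sqrt(0.5) * (rng.gauss(0, 1) + shared) for _ in range(5)]
    return [z1] + z2


def simulate_subject(times, z, rng):
    """Noisy observations of one subject's trajectory at the given times."""
    zbar = sum(b * x for b, x in zip(BETA0, z))
    xi1 = rng.gauss(0, math.sqrt(1 / 4))    # score with variance 1/4
    xi2 = rng.gauss(0, math.sqrt(1 / 16))   # score with variance 1/16
    return [mu(t, zbar) + xi1 * phi1(t) + xi2 * phi2(t) + rng.gauss(0, 0.1)
            for t in times]                 # error sd 0.1, i.e. variance 0.01


rng = random.Random(2024)
z = draw_covariates(rng)
y = simulate_subject([0.1, 0.5, 0.9], z, rng)
```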

For the measurement schedule, we use a "jittered" design with an equally spaced grid $\{c_0, \ldots, c_{50}\}$ on [0, 1] ($c_0 = 0$ and $c_{50} = 1$) and then jitter each point $c_j$ by $s_j = c_j + e_j$, where the $e_j$ are i.i.d. $N(0, 0.0001)$, $s_j = 0$ if $s_j < 0$ and $s_j = 1$ if $s_j > 1$. This results in a jittered schedule that is no longer equally spaced; from there, a random sample of size $N_i$ is selected from $\{s_1, \ldots, s_{49}\}$ without replacement to serve as the measurement schedule for the $i$th subject, where $N_i$ is itself sampled from a discrete uniform distribution on $\{2, \ldots, 10\}$.
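A short sketch of this jittered design is given below. Whether the jittered grid is drawn once or redrawn for each subject is not stated in the text; the sketch draws it once and then samples each subject's schedule from it, which is an assumption, as are the function names.

```python
import random


def jittered_grid(rng, n_grid=50, sd=0.01):
    """s_j = c_j + e_j with c_j = j/n_grid and e_j ~ N(0, sd^2), clipped to [0, 1]."""
    return [min(1.0, max(0.0, j / n_grid + rng.gauss(0, sd)))
            for j in range(n_grid + 1)]


def subject_schedule(rng, s):
    """Draw N_i from {2, ..., 10}, then N_i interior points without replacement."""
    n_i = rng.randint(2, 10)               # inclusive on both ends
    return sorted(rng.sample(s[1:-1], n_i))  # s[1:-1] is s_1, ..., s_49


rng = random.Random(1)
s = jittered_grid(rng)                      # sd 0.01 corresponds to variance 0.0001
schedules = [subject_schedule(rng, s) for _ in range(100)]
```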

We experimented with several m-fold cross-validation (CV) methods and 
found the 10-fold method to be satisfactory. Table 1 reports the results for 
m = 3 and 10. The results for the parametric estimate $\hat\beta$ are comparable for
3- and 10-fold CVs with the 10-fold CV being somewhat better. In terms 
of estimating the mean function, the 10-fold CV performs better. Figure 1 
also suggests good performance of the 10-fold CV method in terms of bias. 
The plot for the 3-fold CV is similar, but is not provided.

Table 1
Performances of estimators for 3- and 10-fold CV. (Measures of differences between $\hat\beta$ and $\beta$: $\|\cdot\|$ measures the difference in the Euclidean norm and $\cos^{-1}$ in terms of the angle between $\hat\beta$ and $\beta$.) The IMSE is the integrated mean squared error, defined as $\int\!\!\int [\hat\mu(t, u) - \mu(t, u)]^2\,dt\,du$. Note that the IMSE of $\hat\mu$ in simulation II is much larger than that in simulation I, due to different scales of $t$ and $\beta^T z$, and different mean functions

Simulation   CV   $\|\hat\beta - \beta\|$   $\cos^{-1}(\hat\beta^T \beta)$   IMSE($\hat\mu$)
I             3   0.2121 (0.0867)           12.1900 (5.0161)                 0.0312
I            10   0.2121 (0.0793)           12.1860 (4.5833)                 0.0237
II            3   0.2575 (0.1923)           14.8944 (11.3004)                0.2723
II           10   0.2501 (0.1914)           14.4675 (11.2666)                0.2257
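The two discrepancy measures reported for the index estimate can be computed as in the sketch below, which assumes both vectors have unit length. Reporting the angle in degrees is an inference from the magnitudes in Table 1 (a Euclidean distance of 0.2121 between unit vectors corresponds to an angle of roughly 12.2 degrees); the function names are hypothetical.

```python
import math


def norm_diff(b_hat, b):
    """Euclidean distance between the estimated and true directions."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(b_hat, b)))


def angle_degrees(b_hat, b):
    """Angle between two unit vectors, arccos(b_hat^T b), in degrees."""
    dot = sum(x * y for x, y in zip(b_hat, b))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))


# For unit vectors the two measures are linked by angle = 2 * asin(distance / 2).
a = [1.0, 0.0]
b = [math.cos(0.3), math.sin(0.3)]
d = norm_diff(a, b)
theta = angle_degrees(a, b)
```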



Fig. 1. True mean function, averaged estimated mean function and bias in simulation study I.

4.2. Simulation II: Longitudinal covariates. This simulation scheme is inspired by the CD4+ cell counts data from the Multicenter AIDS Cohort Study or MACS [Kaslow et al. (1987)], which is analyzed in Section 5. There are five covariates in this AIDS data: age at seroconversion and four longitudinal covariates [packs of cigarettes, recreational drug use (1: yes, 0: no), number of sexual partners and mental illness scores (CESD), larger values indicate increased depressive symptoms]. In the simulation, the covariate values were based on the five covariates from 100 randomly selected subjects. The mean function, coefficient of the index, two eigenfunctions and two eigenvalues were also chosen to mimic the corresponding values of the real data (see Section 5). Therefore, we choose the mean function

where the index coefficient $\beta_0^T = (0.1043, 0.5213, 0.8341, -0.1043, -0.1043)$ and $t \in [-3, 5.5]$. The two eigenfunctions are $\phi_1(t) = \cos\{(t + 3)\pi/8.5\}/\sqrt{4.25}$ and $\phi_2(t) = -\sin\{(t + 3)\pi/8.5\}/\sqrt{4.25}$, with respective eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 0.5$. For each subject, the two principal component scores are generated from $N(0, \lambda_1)$ and $N(0, \lambda_2)$. Also, normally distributed measurement errors with mean zero and variance 0.1 are added.

Fig. 2. True mean function, averaged estimated mean function and bias in simulation study II. (Panel titles: True Mean; Estimated Mean; Bias of the Mean Estimator.)

Consistent with the results in simulation I, where the covariates are time invariant, the results in Table 1 for estimating $\beta$ and the mean function $\mu(t, \beta^T z(t))$ are also comparable for the 3-fold and 10-fold CVs, with 10-fold CV slightly better. Again, we only provide the plot of estimates based on the 10-fold CV. Other than at the boundary, Figure 2 suggests good performance of the 10-fold CV method in terms of bias. The boundary effect appears to be due to the sparsity of the data and is more prominent than in the previous simulation setting of time-invariant covariates. The observed $\beta^T z$ are very sparse near the boundaries.

5. Application. We illustrate the methodology via CD4+ cell counts data from the Multicenter AIDS Cohort Study or MACS [Kaslow et al. (1987)]. HIV destroys CD4 cells, which play a vital role in the immune system. The CD4 cell count is thus a good marker for disease progress. The number of CD4 cells might also be related to some subject-specific factors such as smoking, age, etc. In the first example, we apply our approach to the CD4 data analyzed in Wu and Chiang (2000), where the covariates are time invariant. The second example is the CD4 data analyzed in Zeger and Diggle (1994), where longitudinal covariate variables are available.

5.1. Example I: Time-invariant covariates. This data set involves 1817 measurements of CD4 percentages, which are cell counts divided by the total number of lymphocytes, observed for 283 homosexual men who became HIV positive between 1984 and 1991. The measurements were scheduled at each half-yearly visit; however, the actual measurement times may vary and some subjects missed some of their scheduled visits. The resulting measurement times $t_{ij}$ per subject are irregular and sparse. Three time-independent covariate variables are considered in our analysis: smoking status (1: yes, 0: no), age at HIV infection and pre-HIV infection CD4 percentage. To make the scales of different covariates compatible, we standardize age and pre-HIV infection CD4 percentage. Similarly to the simulation study, we use 3-fold and 10-fold CV to choose the bandwidths for the nonparametric procedures.

To avoid being trapped in a local minimum, we choose ten different random initial values for $\hat\beta_{(0)} = (\beta_{1(0)}, \beta_{2(0)}, \beta_{3(0)})^T$ as follows. First, we pick five different values (0.1, 0.3, 0.5, 0.7 and 0.9) for $\beta_{1(0)}$ and generate $\beta_{2(0)}$ from $U(0, \sqrt{1 - \beta_{1(0)}^2})$; then we set $\beta_{3(0)} = \sqrt{1 - \beta_{1(0)}^2 - \beta_{2(0)}^2}$ to ensure that $\|\hat\beta_{(0)}\| = 1$. The signs of $\beta_{2(0)}$ and $\beta_{3(0)}$ are initially randomly assigned and then flipped to make up the ten initial values $\hat\beta_{(0)}$. These ten initial values all lead to the same $\hat\beta$.
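One way to generate such starting directions is sketched below. It rests on two readings of the description that should be treated as assumptions: the uniform draw for the second coordinate is taken on $(0, \sqrt{1 - \beta_{1(0)}^2})$ so that the third coordinate is real, and "flipped" is read as reversing both random signs to turn each of the five draws into a pair. The function name is hypothetical.

```python
import math
import random


def initial_directions(rng, first_coords=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return ten unit vectors in R^3 with prescribed first coordinates."""
    starts = []
    for b1 in first_coords:
        b2 = rng.uniform(0.0, math.sqrt(1 - b1 * b1))
        b3 = math.sqrt(max(0.0, 1 - b1 * b1 - b2 * b2))
        s2, s3 = rng.choice((-1, 1)), rng.choice((-1, 1))
        starts.append((b1, s2 * b2, s3 * b3))
        starts.append((b1, -s2 * b2, -s3 * b3))  # same draw with signs flipped
    return starts


starts = initial_directions(random.Random(7))
```

Running the iterative algorithm from each start and keeping the solution with the smallest objective value in (2.3) is a simple safeguard against local minima.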

Several statistical models have been applied to this data set, such as varying coefficient models in Wu and Chiang (2000). In their analysis, only the effect of pre-infection CD4 percentage was found to be significant and positive, but none of the covariate effects seem time-dependent [see Figures 1 and 2 in Wu and Chiang (2000)]. This result is consistent with our findings in Table 2 and Figure 3. We find that people who smoke, who are young when they get the HIV infection and who have higher pre-HIV infection CD4 percentages tend to have higher post-HIV infection CD4 percentages on average. However, only pre-HIV infection CD4 percentage is significant. In the right panel of Figure 3, we observe that, in general, the CD4 percentages deplete rather quickly at the beginning of HIV infection and the rates of depletion during the first 2.5 years are generally higher than in later years. However, the time when the rate of depletion slows down varies with the levels of $\hat\beta^T z$. More specifically, when $\hat\beta^T z$ is larger, the rate of depletion slows down earlier.

Table 2
Estimated parametric index $\hat\beta$ and asymptotic covariance of $\hat\beta$ for example I [here, $h_\mu = (h_t, h_z)$ is the bandwidth for estimating $\hat\beta$ and $\hat\mu(t, \hat\beta^T z)$]

CV                 3                                10
$h_\mu$            (2.14, 2.80)                     (1.70, 5.00)
$\hat\beta$        (0.0727, -0.1074, 0.9916)        (0.0877, -0.1076, 0.9903)
Var($\hat\beta$)    0.4213   0.0141  -0.0796         0.4137   0.0103  -0.1072
                    0.0141   0.0887  -0.0161         0.0103   0.0898  -0.0128
                   -0.0796  -0.0161   0.0602        -0.1072  -0.0128   0.0932

Fig. 3. Estimated mean function of AIDS data for example I. (Panel titles: Estimated Mean Surface; Mean Curves for Different $\beta^T Z$. Horizontal axis: Year.)

5.2. Example II: Time-invariant and longitudinal covariates. In this data set, 2376 CD4 observations on 369 subjects were made and the times of observation ranged from 3 years before to 6 years after seroconversion. Five covariates considered in this analysis are age, packs of cigarettes, recreational drug use (1: yes, 0: no), number of sexual partners and mental illness scores (CESD) (larger values indicate increased depressive symptoms). Except for age, the other four covariates are longitudinal. As in example I, we applied 3- and 10-fold CVs to choose the bandwidths in nonparametric procedures and adopted the same strategy to select 10 initial values for $\hat\beta_{(0)}$. It turned out that all ten random initial values $\hat\beta_{(0)}$ lead to the same $\hat\beta$.

Previous analysis for this data includes the semiparametric models in Zeger and Diggle (1994), where age, smoking, recreational drug use and increased numbers of sexual partners are associated with higher CD4 cell numbers, while increased depressive symptoms are associated with decreased CD4 levels, but the effects of age and recreational drug use are not significant.

In our analysis, among these five covariates, the effect of packs of cigarettes smoked per day is the most significant. Moreover, our analysis in Table 3 suggests that an increasing number of sexual partners is negatively associated with CD4 counts, which seems more reasonable than the previous result. From Table 3 and Figure 4, we also observe higher mean CD4 cell numbers when subjects are older, smoke more, use recreational drugs and have lower CESD. The right panel of Figure 4 suggests a big decline of CD4 cell counts half a year before the seroconversion and one year after seroconversion. After one year of seroconversion, the decline slows down slightly. The trough at the end might be due to the boundary effect.

Table 3
Estimated parametric index $\hat\beta$ and asymptotic covariance of $\hat\beta$ for example II [here, $h_\mu = (h_t, h_z)$ is the bandwidth for estimating $\hat\beta$ and $\hat\mu(t, \hat\beta^T z)$]

3-fold CV
$h_\mu$            (1.25, 3.00)
$\hat\beta$        (0.0141, 0.5700, 0.8211, -0.0159, -0.0216)
Var($\hat\beta$)    0.0035   0.0045   0.0210  -0.0010  -0.0003
                    0.0045   0.0956   0.2733  -0.0049  -0.0038
                    0.0210   0.2733   2.4311   0.0029  -0.0214
                   -0.0010  -0.0049   0.0029   0.0069  -0.0009
                   -0.0003  -0.0038  -0.0214  -0.0009   0.0021

10-fold CV
$h_\mu$            (1.00, 4.00)
$\hat\beta$        (0.0128, 0.5530, 0.8326, -0.0193, -0.0225)
Var($\hat\beta$)    0.0037   0.0070   0.0284  -0.0011  -0.0005
                    0.0070   0.1287   0.3744  -0.0065  -0.0061
                    0.0284   0.3744   2.7206  -0.0018  -0.0302
                   -0.0011  -0.0065  -0.0018   0.0070  -0.0008
                   -0.0005  -0.0061  -0.0302  -0.0008   0.0023

Fig. 4. Estimated mean function of AIDS data for example II. (Panel titles: Estimated Mean Surface; Mean Curve of Different.)

6. Discussion and conclusions. The proposed estimate of the single index for longitudinal response data works well in our simulations and data application. It has the advantage that the same bandwidth is used to estimate the nonparametric mean function and the single index without the need to undersmooth the mean function in order to achieve the $\sqrt{n}$-convergence rate, as is often the case in semiparametric regression models with independent response data. This leads to a unified approach to selecting the bandwidth. Additional computational savings are accomplished through the $m$-fold cross-validation methods. The simulation results reported in Section 4 suggest that the performance of the procedure is not very sensitive to the choice of $m$ and the initial value $\hat\beta_{(0)}$.

We have derived asymptotic distributions for both the parametric ($\beta$) and nonparametric ($\mu$) components of the model and illustrated their usefulness for statistical inference via AIDS data. While it is possible to extend the approach to multiple indices, such an extension would be computationally intensive and subject to the curse of dimensionality.

An additive model

$E\{Y(t)\} = \mu(t, \beta_0^T Z(t)) = \mu_t(t) + \mu_z(\beta_0^T Z(t))$

can be viewed as a special case of model (1.3), and if the model is actually additive, our approach can be modified to estimate the parametric and nonparametric components easily. To estimate $\beta_0$ and the two functions $\mu_t(t)$ and $\mu_z(\beta_0^T Z(t))$, we can perform a two-step procedure. First, apply a one-dimensional scatter plot smoother to $\{(Y_{ij}, T_{ij}) : i = 1, \ldots, n;\ j = 1, \ldots, N_i\}$ to estimate $\mu_t(t)$. Then, apply modified rMAVE (2.3) to the residuals to estimate $\beta_0$. Specifically, $\beta_0$ can be estimated by solving the minimization problem

' ' \j=l l=\ i=\ k=l I 

where $Y^*_{ik} = Y_{ik} - \hat{\mu}_t(T_{ik})$ for $1 \le k \le N_i$ and $1 \le i \le n$.
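
The two-step idea for the additive model can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated with a two-dimensional covariate, the first step uses a simple local linear smoother, and the second step replaces modified rMAVE with a plain grid search over the direction angle that minimizes a leave-one-out kernel regression criterion on the residuals. All function names, bandwidths and the link $2u^2$ are hypothetical choices for the example.

```python
import math
import random

def kern(u):
    """Epanechnikov kernel supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def local_linear(x0, xs, ys, h):
    """Local linear fit of ys on xs, evaluated at x0."""
    s0 = s1 = s2 = r0 = r1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = kern(d / h)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        r0 += w * y
        r1 += w * d * y
    return (s2 * r0 - s1 * r1) / (s0 * s2 - s1 * s1)

def loo_rss(index, resid, h):
    """Leave-one-out residual sum of squares of a kernel smoother of resid on index."""
    rss = 0.0
    for i, (ui, ri) in enumerate(zip(index, resid)):
        num = den = 0.0
        for j, (uj, rj) in enumerate(zip(index, resid)):
            if j != i:
                w = kern((uj - ui) / h)
                num += w * rj
                den += w
        fit = num / den if den > 0.0 else 0.0
        rss += (ri - fit) ** 2
    return rss

# Simulated additive model (assumption): Y = sin(2*pi*T) + 2*(beta0'Z)^2 + noise.
random.seed(1)
theta_true = 0.6
b1, b2 = math.cos(theta_true), math.sin(theta_true)
T, Z, Y = [], [], []
for _ in range(50):        # subjects
    for _ in range(4):     # visits per subject
        t = random.random()
        z = (random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0))
        u = b1 * z[0] + b2 * z[1]
        T.append(t)
        Z.append(z)
        Y.append(math.sin(2 * math.pi * t) + 2.0 * u * u + random.gauss(0.0, 0.2))

# Step 1: one-dimensional scatter plot smoother of Y on T, then residuals.
mu_t_hat = [local_linear(t, T, Y, 0.15) for t in T]
resid = [y - m for y, m in zip(Y, mu_t_hat)]

# Step 2 (stand-in for modified rMAVE): grid search over unit directions
# with nonnegative first coordinate.
def rss_at(theta):
    c, s = math.cos(theta), math.sin(theta)
    return loo_rss([c * z[0] + s * z[1] for z in Z], resid, 0.3)

grid = [-math.pi / 2 + k * math.pi / 36 for k in range(36)]
theta_hat = min(grid, key=rss_at)
beta_hat = (math.cos(theta_hat), math.sin(theta_hat))
print(theta_hat, beta_hat)
```

The grid search is feasible here only because the covariate is two-dimensional; for higher dimensions the rMAVE iteration described in the text is the appropriate tool.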

Model (1.3) extends the popular single-index model from independent
univariate to longitudinal response data. Our extension allows both
time-independent and longitudinal covariates, but restricts the effect of the
covariates to be time invariant. Such a time-invariant approach is in line with
the philosophy of linear mixed-effects models, where the covariate effect is
time invariant. In this regard, our approach could be viewed as an extension
of a parametric linear mixed-effects model to a more flexible semiparametric
mixed-effects model. While such an extension may still be considered
restrictive, as compared to an approach that adopts a time-dependent
direction $\beta(t)$ to model the covariate effects, the time-invariant direction $\beta$ in
our model has a nice interpretation as the average covariate effect over
time. Thus, $\beta$ as an average of $\beta(t)$ serves as a simple summary measure
for the (possibly more complicated) time-dependent covariate effects.
Moreover, a time-dependent approach would require a lot more data to correctly
estimate the direction $\beta(t)$. When circumstances allow, one way to extend
our approach to a time-dependent direction $\beta(t)$ is to adopt a two-stage
procedure: at the first stage, one bins the data in the direction of time and
then applies rMAVE to the data observed in the bin that contains $t$ to
obtain an initial estimate of $\beta(t)$; at the second stage, these initial
estimates are smoothed to improve on them. This is a subject for future
investigation and is beyond the scope of this paper.

Thus far, we have focused on estimation of the unknown components in 
the functional single-index model. A functional principal component analy- 
sis (FPCA) could be added after the covariate-adjusted mean function has 
been estimated. The mean-adjusted FPCA (mFPCA) proposed in Jiang 
and Wang (2010) can be used to reconstruct the random trajectories. More 
specifically, we can first show that the asymptotic properties of the
covariance estimator in Theorem 3.5 of Jiang and Wang (2010) hold by
exploiting the $\sqrt{n}$-consistency of $\hat{\beta}$. Then, the eigenvalues and eigenfunctions
corresponding to the estimated covariance can be estimated by solving the
eigenequation, and the PACE method proposed in Yao, Müller and Wang
(2005) can be used to estimate the principal component scores and to select 
the number of components. 
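
As a minimal sketch of the eigenequation step, the following discretizes a covariance surface on a grid and extracts the leading eigenvalue and eigenfunction by power iteration. The covariance used here is a toy example with known eigenpairs (an assumption for illustration, not the mFPCA estimator), and the function name `leading_eigenpair` is hypothetical.

```python
import math

# Midpoint grid on [0, 1]; dt is the quadrature weight.
m = 50
grid = [(k + 0.5) / m for k in range(m)]
dt = 1.0 / m

def phi1(t):
    return math.sqrt(2.0) * math.sin(math.pi * t)

def phi2(t):
    return math.sqrt(2.0) * math.sin(2.0 * math.pi * t)

# Toy covariance with eigenvalues 2.0 and 0.5 (assumed for illustration).
cov = [[2.0 * phi1(s) * phi1(t) + 0.5 * phi2(s) * phi2(t) for t in grid]
       for s in grid]

def leading_eigenpair(cov, dt, iters=200):
    """Power iteration for the discretized eigenequation
    int cov(s, t) phi(s) ds = lam * phi(t), with int phi^2 = 1."""
    phi = [1.0] * len(cov)
    lam = 0.0
    for _ in range(iters):
        new = [dt * sum(row[j] * phi[j] for j in range(len(phi))) for row in cov]
        norm = math.sqrt(dt * sum(v * v for v in new))
        phi = [v / norm for v in new]
        lam = norm  # converges to the leading eigenvalue
    return lam, phi

lam_hat, phi_hat = leading_eigenpair(cov, dt)
max_err = max(abs(p - phi1(t)) for p, t in zip(phi_hat, grid))
print(lam_hat, max_err)
```

In practice the covariance surface would be the smoothed estimate obtained after removing the covariate-adjusted mean, and further eigenpairs would be obtained by deflation or a full symmetric eigensolver.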

APPENDIX A: ASSUMPTIONS 

The following type of continuity, as defined in Yao (2007), will be needed:
a real function $f(x,y) : \mathbb{R}^{n+m} \to \mathbb{R}$ is continuous on $A \subseteq \mathbb{R}^n$, uniformly in
$y \in \mathbb{R}^m$, if, given any $x \in A$ and $\varepsilon > 0$, there exists a neighborhood of $x$ not
depending on $y$, say $U(x)$, such that $|f(x',y) - f(x,y)| < \varepsilon$ for all $x' \in U(x)$
and $y \in \mathbb{R}^m$. Our proofs cover both time-independent and time-dependent
covariates with slightly different assumptions and arguments. For both cases,
we assume the observation times $T_{ij}$ are i.i.d. with probability density
function $f(t)$. For a random design with time-invariant covariates, we assume
that $(T_{ij}, Z_i, Y_{ij})$ have the same distribution as $(T, Z, Y)$ with joint
probability density function $f_3(t,z,y)$, but dependency is allowed among
observations from the same subject. The joint probability density functions of $(T, Z)$
and $(T_1, T_2, Z, Y_1, Y_2)$ are denoted as $f_2(t,z)$ and $f_5(t_1,t_2,z,y_1,y_2)$,
respectively. If the covariate is longitudinal, then we assume that $(T_{ij}, Z_{ij}, Y_{ij})$ have
the same distribution as $(T, Z, Y)$ with joint probability density function
$f_3(t,z,y)$ and the joint probability density function of $(T_1, T_2, Z_1, Z_2, Y_1, Y_2)$
is $f_6(t_1,t_2,z_1,z_2,y_1,y_2)$.

Below, we describe the various assumptions that appear in the theorems. 

A.l /it X /i^ X /i, /i ^ 0, nE{N)h'^ 00, E{N)h'^ and nE{N)h^ < 00. 

A.l' /it X /i^ X /i, /i ^ 0, 7iE(7V)/i2 00, E{N)h and nE{N)h^ < 00. 

A.2 The number of observations $N_i(n)$ for the $i$th subject is a random
variable with $N_i(n) \sim N(n)$, where $N(n)$ is a positive integer-valued
random variable such that $\limsup_{n\to\infty} E N(n)^2/[E N(n)]^2 < \infty$ and $N_i(n)$,
$i = 1,\ldots,n$, are i.i.d.

A.3 The conditional mean $\mu(t, \beta^T z) = E(Y \mid T = t, \beta^T Z = \beta^T z)$ and its
derivatives up to second order are continuous on $\{(t,z)\}$ and its derivatives
up to the third order are bounded for all $\beta : |\beta - \beta_0| < \delta$, for a $\delta > 0$.

A.4 The joint probability density function $f_2(t,z)$ and its derivatives up
to third order are bounded, up to second order they are continuous on
$\{(t,z)\}$ and $f_2(t,z)$ is bounded away from zero for all $\beta : |\beta - \beta_0| < \delta$,
for a $\delta > 0$.

A.5 The joint probability density function $f_3(t,z,y)$ and its derivatives up
to second order exist and are continuous on $\{(t,z)\}$, uniformly in $y \in \mathbb{R}$
for all $\beta : |\beta - \beta_0| < \delta$, for a $\delta > 0$.

A.6 $f_6(t_1,t_2,z_1,z_2,y_1,y_2)$ is continuous on $\{(t_1,t_2,z_1,z_2)\}$, uniformly in
$(y_1,y_2) \in \mathbb{R}^2$ for all $\beta : |\beta - \beta_0| < \delta$, for a $\delta > 0$.

A.6' $f_5(t_1,t_2,z,y_1,y_2)$ is continuous on $\{(t_1,t_2,z)\}$, uniformly in
$(y_1,y_2) \in \mathbb{R}^2$ for all $\beta : |\beta - \beta_0| < \delta$, for a $\delta > 0$.

The bandwidth assumption A.1 and the assumption on the measurement
schedule A.2 are required to show that the usual local properties of the
kernel estimators hold for longitudinal or functional data in the presence of
within-subject correlation. Assumptions A.3-A.6 are regularity conditions
for joint probability density functions and the mean function. These
regularity conditions, along with the bandwidth assumption A.1, are needed
for the consistency results. A.1' and A.6' are the assumptions when the
covariate variables are time invariant. Adopting assumption A.4 is common
practice in the theory of regression estimation to study estimators on sets
bounded away from the troublesome regions [e.g., Hall (1989), Härdle, Hall
and Ichimura (1993) and Xia et al. (2002)].

APPENDIX B: PROOFS OF (3.3) AND (3.4) 

Proof of (3.3). Let $(\beta, B)$ be an orthogonal matrix and, by Lemma
D.2, we obtain

n Ni 

{nENf2{t, z)}-^ Yl H ^h{Tij - t, Zij - z){Zij - z){Zij - zf 
i=i j=i 
(B.1)



{n¥.Nh{t, Yl ^h^j - t, 4- - 5)(/3, B) ( 

i=i j=i ^ 



xiZ,,-z){Z,,-zf{(3,B)(^f^Jr^ 



{13, B) 



Fpit,z)Bhl 



B^Fjit, z)hl B^G{z)B + Op{h^)J \ B 

d f f ,,T/ 







where =E{(Z,j - z){Zij - z)' } and = ^(/ai^^ {t,z))/f2. 

Next, consider 



= {nEN}-' Y E 4(.T,k,Zik){nENf2{Tik,Z~,k)}-' 



1=1 k=l 



X E E Kh{Tji — Tik,Zjg — Zik) 
j=i e=i 

X (Zje — Zik){Zje — Zikf^ 



n Ni 

{nENY^ J2 4(T^k,Zik)W, B) 

1=1 k=l 



hi Fp{T,u,Zik)Bhl 
B'^FnTik,Z,k)hl B'^G{Z,k)B + Opih^ 



+ Op{h^) 



hi 



FBhl 




B^ 



\ B'^F^hl 2B^GB + Op{h^ + 
+ Op{h? + 5ph^), 

where F = E{{^fFp{T,Z)], G = \E{{§l^fG{Z)] , the second equahty 
follows from (B.1) and the last equality follows from Lemma D.1.

Using the formula for matrix inverse in block form and letting t = E{{§^f]h1 
and G* = {B^GB)'^, we obtain 

/ 1 

{DPj~' = il3,B) 



) +0p{h + 6p) 



—FBG*hl 
T 2t 

, zlG*B^F^hl -G* 

^ _ '^{BG*B^F^P^ + PFBG*B^) + -BG*B^ + 0„(/i + 5b) 
T 2t 2 
^ M _ ^(G+F^/5o^ + /3oFG+) + + 0,{h + 5,),
T It Z 

where $G^+ = B_0 (B_0^T G B_0)^{-1} B_0^T$ is the Moore-Penrose inverse of $G$. □
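
The representation of the Moore-Penrose inverse used above can be checked numerically. The sketch below assumes a three-dimensional example in which $G$ is symmetric with null space spanned by $\beta_0$ and $B_0$ has orthonormal columns orthogonal to $\beta_0$; the particular numbers and helper names are hypothetical. It verifies the four Penrose conditions for $B_0 (B_0^T G B_0)^{-1} B_0^T$.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inv2(m):
    """Inverse of a 2 x 2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# beta0 is a unit vector; the columns of B0 are orthonormal and orthogonal to beta0.
beta0 = [[1.0 / 3.0], [2.0 / 3.0], [2.0 / 3.0]]
B0 = [[2.0 / 3.0, 2.0 / 3.0],
      [-2.0 / 3.0, 1.0 / 3.0],
      [1.0 / 3.0, -2.0 / 3.0]]

# G = B0 M B0^T is symmetric, has rank 2 and satisfies G beta0 = 0.
M = [[2.0, 0.5], [0.5, 1.0]]
G = matmul(matmul(B0, M), transpose(B0))

# Candidate Moore-Penrose inverse: B0 (B0^T G B0)^{-1} B0^T.
inner = matmul(matmul(transpose(B0), G), B0)
G_plus = matmul(matmul(B0, inv2(inner)), transpose(B0))

# The four Penrose conditions.
err1 = max_abs_diff(matmul(matmul(G, G_plus), G), G)
err2 = max_abs_diff(matmul(matmul(G_plus, G), G_plus), G_plus)
err3 = max_abs_diff(matmul(G, G_plus), transpose(matmul(G, G_plus)))
err4 = max_abs_diff(matmul(G_plus, G), transpose(matmul(G_plus, G)))
null_err = max(abs(row[0]) for row in matmul(G, beta0))
print(err1, err2, err3, err4, null_err)
```

The identity relies on $G$ having no component along $\beta_0$, which is why the inverse in block form reduces to an ordinary inverse on the complement spanned by $B_0$.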
Proof of (3.4). To prove (3.4), we first work on
$\{Y_{ik} - a_\beta(T_{j\ell}, Z_{j\ell}) - b_\beta(T_{j\ell}, Z_{j\ell})(T_{ik} - T_{j\ell}) - d_\beta(T_{j\ell}, Z_{j\ell})\beta_0^T(Z_{ik} - Z_{j\ell})\}$.
By Lemma D.3, we have

Yij - afs{t,z) - bfs{t,z){Tij -t)- dp{t,z)(5l {Zy - z) 

dn , . ^ 
= ~-Q^ ^Py^^ + ~ ^".1 

-'-f{T.,-t)-'-^f,{Z,,-z) 
(B.2) + l^-Q^m^ - t? - hi) + - t){Z., - ~z) 

+ Op(/i2 + |5^| + |5^|/i + |5^|/i2 + |5^|2) 

+ Op(A), 

where A = E.,+a,=3 V^i^ " - z^' + \Ti, - t\\{Z,, - zf6p\ + jZj - 

-2"^P(<5/3 + Sfj)- We will now calculate the weighted sum of each term in (B.2).
This leads to the following results:
(i) Let 



Son-(nEiV) 2^2^^^ J 

3=1 1=1 

n Ni 

Kh{Tik — Tji, Zik — Zji) 



(nEiV)-^^5^ 



X 

i=l k=l 

X {Zik - Zje)uJ{Tj£, Zji) 
^' dn/dzdp{Tji,Zji) 



j=l£=l -^2 
j=i e=i J 2

5/i 



X uJ{T,,, Z,t) ( + Op{h + 6^))+ Op(n-i/2) 



= G + 0p(n-i/2). 
(ii) If we let 



d/^iTji, Zji) 

i=l k= 

X (-^jfc — Zji)eik, 



then 



iV„ = (nEiV)-i E E E E ^^(^^^ - ^^'^ " 

i=i k=i j=i 1=1 

3^ ^ dfj{Tji,Zj(,){Zik - Zji) 

h 

n 

i=l k=l 

where 

Iv^V^Tv-^T- T-v 7 .dis{Tjc,Zji){Zik - Zji) 



~ ^^^FaF E E ^h{Tik - Tji, Zik - Zji) 



j=i e=i J 2 



(B.4) 



nEN 

(a/x/ai) + Op(/i + |^ffl) 



^ E E ^hi'^ik - Tji, Zik - Zji){Zik - Zji) 

j=l i=l 



{Z, - E{Z\T = Tik,Z^^ = ZlP)) 1^ + Op{h + \5p\) 



-Vj3{Tik, Zik) -qZq + Op{h + \6i3\). 
Plugging (B.4) into (B.3), we will get

5z0 



n Ni , „ 

Nn = (nETV)-! Yl -^piT,k,Zik) ^ + Op{h + \6is\)) e^fc 



i=l k=l 



X 



(iii) From the definitions of £n,i, ^n,2 and en,3, we obtain 

£n,l=Op{h^), £n,2 = Op{h?) and £n,'3 = Op{h?). 

Let Rn = (nEiV)-i ^ •=! E£i ;s]v ELi Ef=i mTik - Tje, Z,, - Z^,) 

{Z,k - Zjt) (en,i + %f {T^k - Tji) + '^P^iZik - Zj,)) and thus Rn = Op{n~^'^). 
(iv) 

^ ^ Z^Z^ nEiV f„ 

X ^ ^ Kh{Tik — Tji, Zik — Zji){Zik — Zje){{Tik — Tjif — hf} 

1=1 k=l 

= (nEiV)- E E 5 X OAh') = Op{n-y\ 



r- 



=1^=1 



2 



(v) 



^ ' l^l^ nEiV f„ 

j=l£=l J 2 



n N, 

X 'Y^^Y^Kh{Tik — Tji,Zik — Zji){Zik — Zj£){Tik — Tje)fF{Zik — Zje) 



i=l k=l 

= o,(n-V2). 

(vi) 



n. Af,; 

X 'Y^^Y^Kh{Tik — Tji, Zik — Zji){Zik — Zji)[{0^ {Zik — Zje)Y — hj.] 



i=l k=l 

= Op{n-'/'). 
From (i)-(vi), the weighted sum of (B.2) becomes
T = G(/3 - /3o) + Nn + Rn + Op{n-'/^) 



= Gi/3 - /3o) - (nEiV)-i E ["/^o iT^k, Zik) ^ J e^^ + Op(n-i/2) 
and (3.4) is thus proved. □ 



APPENDIX C: PROOF OF THEOREM 3.2 

In this Appendix, we consider $Y_{ij}$ as the $j$th observation of the $i$th subject
made at a random time $T_{ij}$ with a univariate longitudinal covariate $Z_{ij}$,
where $i = 1,\ldots,n$ and $j = 1,\ldots,N_i$. The following definitions are needed to
derive the asymptotic normality of two-dimensional scatter plot smoothers.

A two-dimensional kernel function $K_2 : \mathbb{R}^2 \to \mathbb{R}$ is of order $(\nu, \kappa)$ if

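$$\iint u^{k_1} v^{k_2} K_2(u,v)\,du\,dv =
\begin{cases}
0, & 0 \le k_1 + k_2 < \kappa,\ k_1 \ne \nu_1,\ k_2 \ne \nu_2,\\
(-1)^{|\nu|}\,|\nu|!, & k_1 = \nu_1,\ k_2 = \nu_2,\\
\ne 0, & k_1 + k_2 = \kappa,
\end{cases}$$

(C.1)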

where $\nu$ is a multi-index $\nu = (\nu_1, \nu_2)$ and $|\nu| = \nu_1 + \nu_2$. Also, define the
inverse Fourier transform of $K_2(u,v)$ by

$$\zeta_1(t,z) = \iint \exp(-(iut + iwz)) K_2(u,w)\,du\,dw.$$

Further, given an integer $Q \ge 1$ and for $q = 1,\ldots,Q$, let $\psi_q : \mathbb{R}^3 \to \mathbb{R}$
satisfy:

B.1 $\psi_q(t,z,y)$'s are continuous on $\{(t,z)\}$, uniformly in $y \in \mathbb{R}$;
B.2 the functions $\frac{\partial^p}{\partial t^{p_1}\partial z^{p_2}}\psi_q(t,z,y)$ exist for all arguments $(t,z,y)$ and are
continuous on $\{(t,z)\}$, uniformly in $y \in \mathbb{R}$, for $p_1 + p_2 = p$ and
$0 \le p_1, p_2 \le p$.

The kernel-weighted averages for two-dimensional smoothers are defined 

as 

t 1=1 ]=l 

(C.2) 

where $K_2$ is a kernel function of order $(\nu, \kappa)$ and $h_t$ and $h_z$ are bandwidths
associated with $t$ and $z$, respectively. The property of asymptotic normality
of the local linear estimator $\hat{\mu}(t,z)$ can be shown by using four specific $\psi_q$
functions. Let

$$\alpha_q(t,z) = \frac{\partial^{|\nu|}}{\partial t^{\nu_1}\partial z^{\nu_2}} \int \psi_q(t,z,y) f_3(t,z,y)\,dy$$

and 

$$\sigma_{qr}(t,z) = \int \psi_q(t,z,y)\psi_r(t,z,y) f_3(t,z,y)\,dy\,\|K_2\|^2,$$

where $f_3(t,z,y)$ is the joint density of $(T, Z, Y)$, $\|K_2\|^2 = \iint K_2^2$ and $1 \le q, r \le Q$.

Lemma C.l. Under assumptions A.2-A.6, B.1-B.2, ht^hz^h, h^Q, 

< oo, 



nENhf'+'hf'+'ii^in, . . . , "fQuY - ■ • • ,IE^q™)' ] ^ N{0, S). 

Proof. This lemma can be shown by following similar procedures as
used to prove Lemma C.1 in Jiang and Wang (2010). The only difference is
in the change-of-variable step of showing that Q2 = o(1). □

The following two lemmas can be justified easily by following the
procedures in Jiang and Wang (2010) and thus we omit the proofs.

Lemma C.2. Let H -.W^ — t- M 6e a function with continuous first order 
derivatives, DH{v) = {^H{v), . . . , ^H{v)f and N = lYJ}=iNi. Un- 

der assumptions A.2-A.6, B.1-B.2, ht^^hz^^h, h^O, nE{N)h\''\+'^ 00, 
E(iV)/i2 ^0, ^ ^ and nE{N)hf'^+'^ for some < p^, < 00, 



^ NipH, [DH{ai, aQ)]^^[DH{ai,. . . , ag)]), 
where T, = {(Tqr)i<q^r<i and 

^Ti^ j s\^s''^'K2{si,S2)dsidS2 



ki+k2=K 



X 



'3 QJ^ Qkl+k2-Vl-V2 ^ I 



.9=1 ^ 






Lemma C.3. Under the same assumptions as Lemma C.2, together with
the assumption that the inverse Fourier transform $\zeta_1(t,z)$ is absolutely
integrable,

sup I'^qn- aq\ = Op( , ) where k >i kt >i ■ 

Proof of Theorem 3.2. The theorem can be justified easily by em- 
ploying Lemmas C.2 and C.3, and following the procedures used to prove 
Theorem 3.2 in Jiang and Wang (2010). □ 



APPENDIX D: AUXILIARY RESULTS 

Lemma D.1. Suppose $\{T_i, Z_i, Y_i\}$ are from an i.i.d. sample, where $T_i =
(T_{i1},\ldots,T_{iN_i})$, $Z_i = (Z_{i1},\ldots,Z_{iN_i})$ and $Y_i = (Y_{i1},\ldots,Y_{iN_i})$. Let $\psi_s(t,z,y)$ be
a series of functions and assume that $E\{\psi_s(T,Z,Y)\}$ and $\mathrm{var}\{\psi_s(T,Z,Y)\}$
are both finite. Let $\psi^T = (\psi_1,\ldots,\psi_p)$ and $\Psi^T = (\Psi_1,\ldots,\Psi_p)$, where $\Psi_s =
\frac{1}{nEN}\sum_{i=1}^{n}\sum_{k=1}^{N_i}\psi_s(T_{ik},Z_{ik},Y_{ik})$ for $s = 1,\ldots,p$. Under assumptions A.1-A.6,
we obtain

(D.1) $$\sqrt{n}\{\Psi - E(\Psi)\} \xrightarrow{d} N_p(0, \Sigma),$$



where

$$\Sigma = \frac{E\{\psi(T,Z,Y)\psi^T(T,Z,Y)\}}{E(N)} + \frac{E\{N(N-1)\}}{\{E(N)\}^2}\,E\{\psi(T,Z,Y)\psi^T(T',Z',Y')\} - E\{\psi(T,Z,Y)\}E\{\psi^T(T,Z,Y)\}.$$
Equation (D.1) implies that

$$\frac{1}{nEN}\sum_{i=1}^{n}\sum_{k=1}^{N_i}\psi(T_{ik},Z_{ik},Y_{ik}) = E\{\psi(T,Z,Y)\} + O_p(n^{-1/2}).$$



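Proof. We can prove (D.1) by showing that $\sqrt{n}\{a^T\Psi - a^TE(\Psi)\} \xrightarrow{d}
N(0, a^T\Sigma a)$ for every fixed $a^T = (a_1,\ldots,a_p)$, by the central limit theorem; the
multivariate statement then follows from the Cramér-Wold device. □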

Lemma D.2. Let 
Suppose that $E(Y \mid T = t, Z^T\beta = z) = m(t,z)$ and that assumptions A.1-A.6
hold. Then,

d 

^(3it,z) =m{t,z)f2{t,z)^aua2 + , ^)/2 , ^;)}Cai+l,a2 



+ ^{?7T-(i,5)/2(t,5)}CQi,a2+1^2 + C'p(/l ) orO. 



where $C_{\alpha_1,\alpha_2} = \iint K(u,v)\,u^{\alpha_1}v^{\alpha_2}\,du\,dv$.
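
For a concrete sense of the constants $C_{\alpha_1,\alpha_2}$, the sketch below evaluates them numerically under the assumption that $K$ is a product of two Epanechnikov kernels, so that the double integral factorizes into one-dimensional moments. The kernel choice and the helper names `moment_1d` and `C` are assumptions for illustration.

```python
def epanechnikov(u):
    """Epanechnikov kernel supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) < 1.0 else 0.0

def moment_1d(a, n=2000):
    """Midpoint-rule approximation of the integral of u**a * k(u) over [-1, 1]."""
    du = 2.0 / n
    total = 0.0
    for i in range(n):
        u = -1.0 + (i + 0.5) * du
        total += (u ** a) * epanechnikov(u)
    return total * du

def C(a1, a2):
    """Kernel moment C_{a1,a2} for the product kernel K(u, v) = k(u) k(v)."""
    return moment_1d(a1) * moment_1d(a2)

print(C(0, 0), C(1, 0), C(0, 1), C(2, 0), C(2, 2))
```

For this symmetric kernel the first-order moments vanish, so the terms of order $h_t$ and $h_z$ in the expansion drop out whenever $\alpha_1 = \alpha_2 = 0$, while $C_{2,0} = C_{0,2} = 1/5$ governs the second-order terms.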

Proof. From the definition of expectation and by the techniques of
change-of-variables and Taylor's expansion, we have

X yf3is,u,y)dsdudy 

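$$= \iint K(v_1,v_2)\,v_1^{\alpha_1}v_2^{\alpha_2}\,m(t + v_1h_t, z + v_2h_z)\,f_2(t + v_1h_t, z + v_2h_z)\,dv_1\,dv_2$$

$$= m(t,z)f_2(t,z)C_{\alpha_1,\alpha_2} + \frac{\partial}{\partial t}\{m(t,z)f_2(t,z)\}C_{\alpha_1+1,\alpha_2}h_t + \frac{\partial}{\partial z}\{m(t,z)f_2(t,z)\}C_{\alpha_1,\alpha_2+1}h_z + O(h^2).$$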

The lemma now follows by Lemma C.1. □

In the following lemma, we study the asymptotic expansions of the weighted
least-squares estimator, $\theta^T(t,z) = (a_\beta(t,z), b_\beta(t,z), d_\beta(t,z))$, of the local
linear smoother for the mean function $\mu(t, \beta^Tz)$ when an initial single index $\beta$ is
given. Thus,

$$\theta(t,z) = \arg\min_{\theta}\sum_{i=1}^{n}\sum_{j=1}^{N_i} K_h(T_{ij} - t, Z_{ij} - z)\{Y_{ij} - a_\beta - b_\beta(T_{ij} - t) - d_\beta(Z_{ij} - z)\}^2$$

(D.2) 

^ n Ni 

= — 3- — - Kh{Tij - 1, - z) 



ht hz 



I], 
where



nEiV 

i=l 3=1 



/it ' hz / V ' /it ' /iz 



Lemma D.3. Under assumptions A.1-A.6, 



3/i 



h^{t, z)ht = —ht + — — - — 5i3ht + en,2 

+ Op{h^ + \6p\h^ + \6p\h^ + \6p\^h), 

c/^(t, zj/i^ ^ '92 'dS — 5i — ^""'^ 

+ Op(/l3 + |<5;3|/l2 + |5;3|/l' + |5/3|'/l), 

w/iere 5p = (3Q-l3, T.i = T.^{t,z), i^pit, z) =E{Z\T = t, /3 = z^ /3) - z, all 
the derivatives of fi{t,z) are evaluated at {t,zP) and 



in. 



n N, 

ENf2r^Y.Y.^hmj-t,z 

i=i j=i 



1 



ij - z) \ {Tij - t)/ht 



Proof. By Lemma D.2, we obtain 

S^(/,z) 



V as 









/2 



+ Op(/l2), 



det{S^(t,z)} = (/2f + 0p(/i2) 



and 



/2(t,5) 



f2{t,z) 





_dh 
dt 
_dh 

\ dz ' 



' dt 




dz 








as 
+ Op{h').

By Taylor's expansion at $(t, z^T\beta)$, $Y_{ij} = \mu(T_{ij}, Z_{ij}^T\beta_0) + e_{ij}$ can be expressed as

Y^J = Kt, ~Z') + ^ + % -~') + ^0 - -fh 

' V ' ^ V ' 

Si Ea 

1 5V irj. ,^2 , ,rr 



^3 ^4 

2, 



2 



where = Eai+a,=3 l^*.' " ^M^"- " A^^' + l^^i " ^IK^ii " + \Zf.- 

z-^P(|(5/3| + and the derivatives of $\mu(t,z)$ are evaluated at $(t, z^T\beta)$.
Therefore, $\theta(t,z)$ in (D.2) is the sum of the weighted averages of the terms $E_1, E_2, \ldots$
in this expansion and the error $e_{ij}$. After evaluating the weighted average of each term, which amounts
to smoothing each $E_k$ and $e_{ij}$, the lemma follows by combining all of the
smoothing terms. □